The Difficulties of Taxonomic Name Extraction and a Solution

Authors

  • Guido Sautter
  • Klemens Böhm
Abstract

In modern biology, digitizing biosystematics publications is an important task. Extracting taxonomic names from such documents is one of its major challenges, because these names identify the various genera and species. This article reports on our experience with learning techniques for this task. We explain why established Named-Entity Recognition techniques are difficult to use in our context; one reason is that very little training data is available. Our experiments show that a combined approach relying on regular expressions, heuristics, and word-level language recognition achieves very high precision and recall and makes it possible to cope with those difficulties.
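To make the combination named in the abstract more concrete, the following is a minimal sketch of how a regular expression, a simple heuristic, and a crude word-level language check could work together. It is not the authors' implementation: the pattern `BINOMIAL`, the word list `ENGLISH_WORDS`, the suffix list `LATIN_ENDINGS`, and the functions `looks_latin` and `extract_candidates` are all hypothetical names chosen for illustration, and the lists are far too small for real use.

```python
import re

# Assumed pattern (not from the article): a capitalized genus, or an
# abbreviated genus such as "F.", followed by a lowercase epithet.
BINOMIAL = re.compile(r"\b([A-Z][a-z]+|[A-Z]\.)\s+([a-z]{3,})\b")

# Illustrative stand-ins for a real lexicon and a real language model.
ENGLISH_WORDS = {"the", "this", "these", "that", "are", "was", "were",
                 "and", "from", "with", "names", "species", "found"}
LATIN_ENDINGS = ("us", "a", "um", "is", "ae", "ii", "ensis")


def looks_latin(word):
    """Crude word-level language check: not a known English word
    and ending in a typical Latin suffix."""
    return word not in ENGLISH_WORDS and word.endswith(LATIN_ENDINGS)


def extract_candidates(text):
    """Return candidate binomials that pass the regex and both filters."""
    found = []
    for match in BINOMIAL.finditer(text):
        genus, epithet = match.groups()
        # Heuristic: a capitalized English word is probably a sentence start.
        if genus.lower() in ENGLISH_WORDS:
            continue
        if looks_latin(epithet):
            found.append(f"{genus} {epithet}")
    return found


sample = ("We found Formica rufa near the nest. "
          "These names are common. F. rufa was abundant.")
print(extract_candidates(sample))
```

A rule-based filter of this kind needs no labeled training data, which is the constraint the abstract highlights; the price is that the word and suffix lists must be curated by hand, and a suffix as short as "a" will admit false positives.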

Publication date: 2006